Objective Distance Measures for Spe Concatenative Speech
Authors
Abstract
In unit-selection-based concatenative speech systems, the join cost, which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. The ideal join cost will measure perceived discontinuity, based on easily measurable spectral properties of the units being joined, in order to ensure smooth and natural-sounding synthetic speech. In this paper we report a perceptual experiment conducted to measure the correlation between subjective human perception and various objective spectrally-based measures proposed in the literature. Our experiments used a state-of-the-art unit-selection text-to-speech system: rVoice from Rhetorical Systems Ltd.
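As a rough illustration of the idea in the abstract, the sketch below computes one possible objective join cost (a Euclidean distance between the spectral feature vectors on either side of a join) and the Pearson correlation between such distances and listener ratings. It is not the paper's method: the function names, the choice of Euclidean distance, and the feature representation are all assumptions made for illustration, and the example frames are invented placeholders rather than data from the experiment.

```python
import math


def euclidean_join_cost(left_frame, right_frame):
    """Euclidean distance between two spectral feature vectors.

    left_frame:  features of the last frame of the left unit (hypothetical input)
    right_frame: features of the first frame of the right unit (hypothetical input)
    """
    if len(left_frame) != len(right_frame):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(left_frame, right_frame)))


def pearson_correlation(xs, ys):
    """Pearson correlation between objective distances and perceptual scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder frames (e.g. two short cepstral vectors), not real data.
cost = euclidean_join_cost([1.0, 0.5, -0.2], [0.8, 0.7, 0.1])
```

Given one distance per join and one listener rating per join, `pearson_correlation(distances, ratings)` would indicate how well that particular measure tracks perceived discontinuity; other measures can be compared by swapping out the distance function.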
Similar resources
Objective distance measures for spectral discontinuities in concatenative speech synthesis
The quality of unit selection based concatenative speech synthesis mainly depends on how well two successive units can be joined together to minimise the audible discontinuities. The objective measure of discontinuity used when selecting units is known as the join cost. The ideal join cost will measure perceived discontinuity, based on easily measurable spectral properties of the units being jo...
A perceptual evaluation of distance measures for concatenative speech synthesis
In concatenative synthesis, new utterances are created by concatenating segments (units) of recorded speech. When the segments are extracted from a large speech corpus, a key issue is to select segments that will sound natural in a given phonetic context. Distance measures are often used for this task. However, little is known about the perceptual relevance of these measures. More insight into ...
Spectral Continuity Measures at Mandarin Syllable Boundaries
In text-to-speech (TTS) systems based on concatenative synthesis, the naturalness of synthetic speech is strongly affected by spectral continuity at the concatenation point. In this paper, we focus on four kinds of syllable boundaries in Mandarin and use several spectral distance measures, combined with distance measures on their time derivatives, to predict audible discontinuities. A perceptual...
Comparing spectral distance measures for join cost optimization in concatenative speech synthesis
In concatenative synthesis, the join cost function can be related to the probability of a perceived discontinuity at the join. It is therefore important that the distance measures in the cost function correlate highly with the discontinuities that listeners perceive. This paper presents the results of a listening test on joins in two Norwegian long vowels, /A:/ and /e:/. Five spectral distance measu...